Unsupervised Dense Information Retrieval with Contrastive Learning

In this work, we explore the limits of contrastive learning as a way to train unsupervised dense retrievers and show that it leads to strong performance in various retrieval settings.

On the BEIR benchmark, our unsupervised model outperforms BM25 on 11 out of 15 datasets in terms of Recall@100.
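For reference, Recall@100 can be sketched as below. This is a minimal illustration assuming the common definition (the fraction of a query's relevant documents found in the top k results, averaged over queries); the function names and the `runs` / `qrels` dictionaries are hypothetical and are not taken from the paper or the BEIR code.

```python
def recall_at_k(ranked_doc_ids, relevant_doc_ids, k=100):
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant = set(relevant_doc_ids)
    if not relevant:
        raise ValueError("query has no relevant documents")
    top_k = set(ranked_doc_ids[:k])
    return len(relevant & top_k) / len(relevant)


def mean_recall_at_k(runs, qrels, k=100):
    """Average recall@k over every query that has relevance judgments.

    runs:  {query_id: [doc_id, ...]} ranked best-first (hypothetical layout)
    qrels: {query_id: [relevant_doc_id, ...]} (hypothetical layout)
    """
    scores = [recall_at_k(runs[q], qrels[q], k) for q in qrels]
    return sum(scores) / len(scores)
```

Usage: `mean_recall_at_k(runs, qrels, k=100)` gives one number per dataset, which is the kind of figure being compared against BM25 here.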

#MS_MARCO is also discussed

https://github.com/facebookresearch/contriever

We empirically evaluate our best retriever trained with contrastive learning, called Contriever (contrastive retriever), which uses MoCo with random cropping. (Section 4)
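A rough sketch of what "MoCo with random cropping" involves, as I understand it: two spans cropped independently from the same document form a positive pair, negatives come from a queue of keys encoded in earlier batches, and the key encoder is a momentum (moving-average) copy of the query encoder. Everything below is an assumption for illustration: the crop fractions, the temperature, the momentum value, and all function names are hypothetical, and the encoders are omitted (embeddings are plain lists of floats). It is not the code from the repository.

```python
import math
import random


def random_crop(tokens, min_frac=0.1, max_frac=0.5, rng=random):
    """Return a random contiguous span of `tokens` (fractions are illustrative)."""
    n = len(tokens)
    if n == 0:
        raise ValueError("cannot crop an empty document")
    length = max(1, int(n * rng.uniform(min_frac, max_frac)))
    start = rng.randint(0, n - length)
    return tokens[start:start + length]


def positive_pair(tokens, rng=random):
    """Two independently cropped spans of one document form a positive pair."""
    return random_crop(tokens, rng=rng), random_crop(tokens, rng=rng)


def info_nce(query, positive_key, queue_keys, temperature=0.05):
    """Contrastive (InfoNCE) loss for one query embedding.

    Negatives are the key embeddings stored in the queue, MoCo-style.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    logits = [dot(query, positive_key) / temperature]
    logits += [dot(query, key) / temperature for key in queue_keys]
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


def momentum_update(key_params, query_params, m=0.999):
    """Move the key encoder's parameters slowly toward the query encoder's."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]
```

In a real training loop the two crops would be encoded by the query and key encoders respectively, the loss would be averaged over the batch, and the new keys would be pushed into the queue while the oldest ones are dropped.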

Contriever is trained with contrastive learning on documents sampled from a mix of Wikipedia data and CCNet data. (Section 4.1)